Smart Paper Notes

Progressive Multi-resolution Loss for Crowd Counting

Ziheng Yan, Yuankai Qi, Guorong Li, Xinyan Liu, Weigang Zhang, Qingming Huang, Ming-Hsuan Yang

Category: Computer Vision

2022-12-08

Crowd counting is usually handled as density map regression, supervised by an L2 loss between the predicted density map and the ground truth. To regulate models more effectively, various improved L2 loss functions have been proposed to find a better correspondence between predicted density and annotation positions. In this paper, we propose to predict the density map at one resolution but measure it at multiple resolutions. By maximizing the posterior probability in this setting, we obtain a log-formed multi-resolution L2-difference loss, of which the traditional single-resolution L2 loss is a special case. We mathematically prove that it is superior to a single-resolution L2 loss. Without bells and whistles, the proposed loss substantially improves several baselines and performs favorably compared to state-of-the-art methods on four crowd counting datasets: ShanghaiTech A & B, UCF-QNRF, and JHU-Crowd++.
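The idea of predicting at one resolution and measuring at several can be pictured with a small sketch. This is not the loss derived in the paper: the sum-pooling operator, the pooling factors, and the use of log(1 + x) on each squared difference are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math

def sum_pool(density, factor):
    """Downsample a 2-D density map by summing non-overlapping
    factor x factor blocks, so the total count is preserved."""
    rows, cols = len(density), len(density[0])
    assert rows % factor == 0 and cols % factor == 0
    return [
        [
            sum(density[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]

def l2_difference(a, b):
    """Sum of squared element-wise differences between two maps."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def multi_resolution_log_l2(pred, target, factors=(1, 2, 4)):
    """Assumed form: sum over resolutions of log(1 + squared L2 difference)."""
    return sum(
        math.log1p(l2_difference(sum_pool(pred, f), sum_pool(target, f)))
        for f in factors
    )

def one_hot_map(size, row, col):
    """A size x size density map with a single person at (row, col)."""
    return [[1.0 if (r, c) == (row, col) else 0.0 for c in range(size)]
            for r in range(size)]

target = one_hot_map(4, 0, 0)
near_miss = one_hot_map(4, 0, 1)   # one pixel off
far_miss = one_hot_map(4, 3, 3)    # opposite corner

loss_near = multi_resolution_log_l2(near_miss, target)
loss_far = multi_resolution_log_l2(far_miss, target)
```

In this toy setup a single-resolution L2 loss scores the near miss and the far miss identically, whereas the multi-resolution sum penalizes the far miss more, because the near miss already agrees with the target once the maps are pooled.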


Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words

Haochun Wang, Chi Liu, Nuwa Xi, Sendong Zhao, Meizhi Ju, Shiwei Zhang, Ziheng Zhang, Yefeng Zheng, Bing Qin, Ting Liu

Category: Natural Language Processing

2022-09-14

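Prompt-based fine-tuning of pre-trained models has proven effective for many natural language processing tasks. However, prompt tuning in the biomedical domain has not been thoroughly investigated. Biomedical words are often rare in the general domain but ubiquitous in biomedical contexts, which significantly degrades the performance of pre-trained models on downstream biomedical applications even after fine-tuning, especially in low-resource scenarios. We propose a simple yet effective approach that helps models learn rare biomedical words during prompt tuning. Experimental results show that, using few-shot vanilla prompt settings, our method improves performance on a biomedical natural language inference task by 6% without any extra parameters or training steps.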
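As a rough illustration of combining a prompt with paraphrases of rare words, the sketch below annotates each rare term with a plain-language gloss before filling a prompt template. The template wording, the `[MASK]` placeholder, the function name, and the glossary entry are all hypothetical; the abstract does not specify how the paraphrases are inserted.

```python
def build_prompt(premise, hypothesis, paraphrases, mask_token="[MASK]"):
    """Fill a hypothetical NLI prompt template, appending a paraphrase
    in parentheses after each rare word found in the input text."""
    def annotate(text):
        # Plain substring replacement; a real system would need
        # tokenization-aware matching.
        for word, meaning in paraphrases.items():
            text = text.replace(word, f"{word} ({meaning})")
        return text

    return (f"{annotate(premise)} Question: {annotate(hypothesis)} "
            f"True, false, or neither? Answer: {mask_token}")

# Hypothetical glossary entry for illustration only.
glossary = {"dyspnea": "shortness of breath"}
prompt = build_prompt(
    "The patient has dyspnea.",
    "The patient has trouble breathing.",
    glossary,
)
```

Because the gloss is added to the input text only, this kind of annotation adds no model parameters, which is consistent with the claim in the abstract.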


Multi-modal Contrastive Representation Learning for Entity Alignment

Zhenxi Lin, Ziheng Zhang, Meng Wang, Yinghui Shi, Xian Wu, Yefeng Zheng

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-02

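Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs, which consist of structural triples and images associated with entities. Most previous works focus on how to utilize and encode information from different modalities, yet leveraging multi-modal knowledge in entity alignment is not trivial because of modality heterogeneity. In this paper, we propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model, to obtain effective joint representations for multi-modal entity alignment. Unlike previous works, MCLEA considers task-oriented modality and models the inter-modal relationships for each entity representation. In particular, MCLEA first learns multiple individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions. Extensive experimental results show that MCLEA outperforms state-of-the-art baselines on public datasets under both supervised and unsupervised settings.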
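The contrastive learning step can be illustrated with a generic InfoNCE-style loss between two sets of embeddings. This is a common formulation and not necessarily the exact objective used by MCLEA; the temperature value, the cosine similarity, and the function names are assumptions, and the embeddings are assumed to be non-zero vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other row of `positives` acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy embeddings of the same two entities from two sources.
graph_one = [[1.0, 0.0], [0.0, 1.0]]
graph_two_aligned = [[1.0, 0.0], [0.0, 1.0]]
graph_two_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(graph_one, graph_two_aligned)
loss_swapped = info_nce(graph_one, graph_two_swapped)
```

The loss is near zero when matched entities have similar embeddings and large when they are mismatched, which is the signal that pulls equivalent entities together during training.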


GREASE: Generate Factual and Counterfactual Explanations for GNN-based Recommendations

Ziheng Chen, Fabrizio Silvestri, Jia Wang, Yongfeng Zhang, Zhenhua Huang, Hongshik Ahn, Gabriele Tolomei

Categories: Artificial Intelligence | Machine Learning

2022-08-04

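Recently, graph neural networks (GNNs) have been widely used to develop successful recommender systems. Although powerful, GNN-based recommender systems struggle to attach tangible explanations of why a specific item ends up in the list of suggestions for a given user. Indeed, explaining GNN-based recommendations is unique, and existing GNN explanation methods are inappropriate for two reasons. First, traditional GNN explanation methods are designed for node, edge, or graph classification tasks rather than ranking, as in recommender systems. Second, standard machine learning explanations are usually intended to support skilled decision makers. Recommendations, by contrast, are designed for any end user, so their explanations should be provided in a form the user can understand. In this work, we propose GREASE, a novel method for explaining the suggestions provided by any black-box GNN-based recommender system. Specifically, GREASE first trains a surrogate model on a target user-item pair and its $L$-hop neighborhood. Then, it generates factual and counterfactual explanations by finding optimal adjacency matrix perturbations that capture the sufficient and necessary conditions, respectively, for an item to be recommended. Experimental results on real-world datasets demonstrate that GREASE can generate concise and effective explanations for popular GNN-based recommender models.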


Neuro-Symbolic Learning: Principles and Applications in Ophthalmology

Muhammad Hassan, Haifei Guan, Aikaterini Melliou, Yuqi Wang, Qianhui Sun, Sen Zeng, Wen Liang, Yiwei Zhang, Ziheng Zhang, Qiuyue Hu

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-31

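Neural networks have been expanding rapidly in recent years, with novel strategies and applications. However, a number of challenges in neural network technologies remain unsolved, even though they will inevitably have to be addressed for critical applications. Attempts have been made to overcome the challenges in neural network computing by representing and embedding domain knowledge as symbolic representations. Thus the notion of neuro-symbolic learning (NeSyL) emerged, which incorporates aspects of symbolic representation and brings common sense into neural networks. In domains where interpretability, reasoning, and explainability are crucial, such as video and image captioning, question answering and reasoning, health informatics, and genomics, NeSyL has shown promising results. This review presents a comprehensive survey of state-of-the-art NeSyL approaches, their principles, advances in machine and deep learning algorithms, applications such as ophthalmology, and, most importantly, future perspectives of this emerging field.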


Deep Manifold Learning with Graph Mining

Xuelong Li, Ziheng Jiao, Hongyuan Zhang, Rui Zhang

Categories: Machine Learning | Artificial Intelligence

2022-07-18

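Admittedly, graph convolutional networks (GCNs) have achieved excellent results on graph datasets such as social networks and citation networks. However, the softmax used as the decision layer in these frameworks is generally optimized over thousands of iterations via gradient descent. Furthermore, because it ignores the inner distribution of the graph nodes, the decision layer may lead to unsatisfactory performance in semi-supervised learning with limited label support. To address these issues, we propose a novel graph deep model with a non-gradient decision layer for graph mining. First, manifold learning is unified with label local-structure preservation to capture the topological information of the nodes. Moreover, owing to the non-gradient property, closed-form solutions are obtained and employed as the decision layer for the GCN. In particular, a joint optimization method is designed for this graph model, which greatly accelerates its convergence. Finally, extensive experiments show that the proposed model achieves state-of-the-art performance compared to current models.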


K-Space Transformer for Fast MRI Reconstruction with Implicit Representation

Ziheng Zhao, Tianjiao Zhang, Weidi Xie, Yanfeng Wang, Ya Zhang

Category: Computer Vision

2022-06-14

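This paper considers the problem of fast MRI reconstruction. We propose a novel Transformer-based framework that directly processes the sparsely sampled signals in k-space, going beyond the regular-grid limitation of ConvNets. We adopt an implicit representation of the spectrogram, treating spatial coordinates as inputs, and dynamically query the partially observed measurements to complete the spectrogram, i.e., learning the inductive bias in k-space. To strike a balance between computational cost and reconstruction quality, we build a hierarchical structure with separate low-resolution and high-resolution decoders. To validate the necessity of the proposed modules, we conduct extensive experiments on two public datasets and demonstrate superior or comparable performance relative to state-of-the-art approaches.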


SPOC learner's final grade prediction based on a novel sampling batch normalization embedded neural network method

Zhuonan Liang, Ziheng Liu, Huaze Shi, Yunlong Chen, Yanbin Cai, Yating Liang, Yafan Feng, Yuqing Yang, Jing Zhang, Peng Fu

Category: Computer Vision

2020-12-15

Recent years have witnessed the rapid growth of Small Private Online Courses (SPOCs), which can be highly customized and personalized to adapt to varying educational requests, and in which machine learning techniques are explored to summarize and predict learners' performance, mostly focusing on the final grade. However, the final grades of learners in SPOCs are generally seriously imbalanced, which handicaps the training of prediction models. To solve this problem, a sampling batch normalization embedded deep neural network (SBNEDNN) method is developed in this paper. First, a combined indicator is defined to measure the distribution of the data, and a rule is then established to guide the sampling process. Second, batch normalization (BN) modified layers are embedded into a fully connected neural network to address the data imbalance problem. Experimental comparisons with three other deep learning methods demonstrate the superiority of the proposed method.
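The effect of a sampling step on imbalanced grade labels can be sketched with plain random oversampling. This is a generic stand-in, not the indicator-guided sampling rule of SBNEDNN, which the abstract does not spell out; the function name, the seed, and the toy labels are hypothetical.

```python
import random
from collections import Counter

def oversample_to_balance(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    target_size = max(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        # Draw extra copies with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target_size - len(group))]
        balanced.extend((sample, label) for sample in group + extra)

    rng.shuffle(balanced)
    return balanced

# Toy data: 8 learners who passed, 2 who failed.
learner_ids = list(range(10))
grades = ["pass"] * 8 + ["fail"] * 2

balanced = oversample_to_balance(learner_ids, grades)
class_counts = Counter(label for _, label in balanced)
```

After balancing, both grade classes contribute equally to each training pass, at the cost of repeating minority samples.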


Objective Surgical Skills Assessment and Tool Localization: Results from the MICCAI 2021 SimSurgSkill Challenge

Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Ziheng Wang, Max Berniker, Satoshi Kondo, Emanuele Colleoni, Dimitris Psychogyios, Yueming Jin, Jinfan Zhou

Category: Computer Vision

2022-12-08

Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.


YOLOX-PAI: An Improved YOLOX Version by PAI

Xinyi Zou, Ziheng Wu, Wenmeng Zhou, Jun Huang

Category: Computer Vision

2022-08-27

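We develop an all-in-one computer vision toolbox named EasyCV to facilitate the use of various SOTA computer vision methods. Recently, we added YOLOX-PAI, an improved version of YOLOX, to EasyCV. We conduct ablation studies to investigate the influence of some detection methods on YOLOX. We also provide an easy way to use PAI-Blade, which accelerates the inference process based on BladeDISC and TensorRT. Finally, we achieve 42.8 mAP on the COCO dataset within 1.0 ms on a single NVIDIA V100 GPU, which is slightly faster than YOLOv6. A simple but efficient predictor API is also designed in EasyCV to perform end-to-end object detection. Code and models are now available at https://github.com/alibaba/easycv.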
